Cumulative residual cholesterol predicts the risk of cardiovascular disease in the general population aged 45 years and older

Background Numerous studies have affirmed a robust correlation between residual cholesterol (RC) and the occurrence of cardiovascular disease (CVD). However, existing studies have focused mainly on individual RC values, and the literature does not adequately address the link between changes in RC and the occurrence of CVD. Hence, the primary objective of this study was to elucidate the association between cumulative RC (Cum-RC) and CVD morbidity.
Methods The changes in RC were categorized into a high-level fast-growth group (Class 1) and a low-level slow-growth group (Class 2) by K-means cluster analysis. To investigate the relationship between combined exposure to multiple lipids and CVD risk, weighted quantile sum (WQS) regression was employed. This analysis estimated weights for total cholesterol (TC), low-density lipoprotein (LDL), and high-density lipoprotein (HDL), the components from which RC is calculated.
Results Among the 5,372 participants, 45.94% were male, and the median age was 58 years. During the three years of follow-up, 669 participants (12.45%) developed CVD. Logistic regression analysis revealed that Class 2 individuals had a significantly lower risk of developing CVD than Class 1 individuals. In the analysis of Cum-RC as a continuous variable, the probability of having CVD increased by 13% for every 1-unit increase in Cum-RC. The restricted cubic spline (RCS) analysis showed that Cum-RC and CVD risk were linearly related (P for nonlinearity = 0.679). The WQS regression showed a nonsignificant but overall positive association between the WQS index and CVD incidence, with the greatest contribution from TC (weight = 0.652), followed by LDL (weight = 0.348).
Conclusion Cum-RC was positively and strongly related to CVD risk, suggesting that in addition to focusing on traditional lipid markers, early intervention in patients with increased RC may further reduce the incidence of CVD.
Supplementary Information The online version contains supplementary material available at 10.1186/s12944-023-02000-0.


Introduction
Cardiovascular disease (CVD) stands as the foremost cause of both mortality and disability in China [1,2]. In the past three decades, the incidence of CVD in China has risen dramatically [3]. Between 2005 and 2020, the overall burden of premature deaths from CVD in China was greater than the global average, far exceeding that in some middle- and high-income countries [4,5]. CVD has emerged as a major public health issue that threatens people's health and well-being. From the perspective of population epidemiology, identifying simple, economical, and reproducible indicators to establish a CVD risk prediction model has become a popular research topic in recent years and can help better identify susceptible groups at high risk of CVD.
Dyslipidemia is a notable and independent risk factor in the onset and progression of cardiovascular events. For a long time, lipid-lowering targets for dyslipidemia prevention and treatment have focused mainly on low-density lipoprotein (LDL) levels. However, even when LDL reaches lipid-lowering target values, a risk of major adverse cardiovascular events remains, referred to as residual risk [6,7]. There is growing evidence that triglyceride (TG) and/or triglyceride-rich lipoprotein (TRL) cholesterol levels may contribute to this residual risk [8]. Therefore, recent studies have gradually focused on residual cholesterol (RC). Several studies have demonstrated the importance of RC in predicting CVD incidence and prognosis independent of LDL [9,10]. Monitoring RC levels may help determine potential CVD risk not reflected by LDL. Recent research [11] revealed that RC was strongly linked to the incidence of metabolic syndrome and the occurrence of CVD. However, that study was limited by its use of a single database and of a single baseline measurement, which may not reveal trends in disease risk. Currently, most studies have focused on baseline RC levels, and only a limited number have examined the relationship between changes in RC levels and CVD incidence. Baseline RC levels provide information only about static factors, while dynamically changing RC levels may better reflect an individual's cholesterol metabolism and trends. Studying changes in RC levels can provide a better understanding of how cholesterol levels fluctuate in individuals during treatment and thus allow a more accurate assessment of the risk of CVD.
The data source is the China Health and Retirement Longitudinal Study (CHARLS) database, which contains numerous high-quality microdata encompassing the household and individual profiles of middle-aged and older adults aged 45 and older in China. In contrast to previous studies that used only single measurements of RC, cumulative RC (Cum-RC) was used in this study to explore the characteristics of populations with different RC trends and to provide a comprehensive view of the impact of dynamic changes in RC on the incidence of CVD in adults aged 45 years and older, making the results more in line with real-world conditions.

Study population
The CHARLS database provided the data for this analysis. The CHARLS national baseline survey was undertaken in 2011 (Wave 1), and further waves of the study were performed in 2013 (Wave 2), 2015 (Wave 3), and 2018 (Wave 4) in 150 counties and 450 urban and rural community neighborhood committees throughout 28 provinces. The county/district and village sampling levels used a probability proportional to size (PPS) approach. Prior to participation, all subjects provided written informed consent [12].
Data for 17,708 study subjects were collected from the 2011 baseline survey as the initial population, and those who met the study objectives were selected according to the following inclusion criteria: (1) aged ≥ 45 at Wave 1; (2) had total cholesterol (TC), high-density lipoprotein (HDL), and LDL levels at Wave 1 and Wave 3; (3) had not yet suffered from CVD, including heart disease and stroke, at Wave 1 and Wave 3; and (4) had CVD status information recorded at Wave 4. Ultimately, 5,372 participants were included in the study population. Figure 1 depicts the specific filtering procedure for respondents.

Evaluation of RC and Cum-RC
The Cum-RC values from 2012 to 2015 were used for this study. RC values were calculated with the formula $\mathrm{RC} = \mathrm{TC} - \mathrm{HDL} - \mathrm{LDL}$ [13]. In addition, the cumulative level of RC between 2012 and 2015 was calculated with the formula $\text{Cum-RC} = \frac{\mathrm{RC}_{2012} + \mathrm{RC}_{2015}}{2} \times (2015 - 2012)$ [14].
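As an illustration of the two formulas above, the following minimal Python sketch computes RC and Cum-RC for a single participant. The function names and the lipid values are hypothetical and are not taken from CHARLS; only the formulas come from the text.

```python
def residual_cholesterol(tc, hdl, ldl):
    """RC = TC - HDL - LDL, with all inputs in mg/dL."""
    return tc - hdl - ldl


def cumulative_rc(rc_start, rc_end, year_start=2012, year_end=2015):
    """Cum-RC = mean of the two RC values times the elapsed years (mg/dL x years)."""
    return (rc_start + rc_end) / 2 * (year_end - year_start)


# Hypothetical participant (illustrative values, not study data)
rc_2012 = residual_cholesterol(tc=190.0, hdl=50.0, ldl=120.0)  # 20.0 mg/dL
rc_2015 = residual_cholesterol(tc=200.0, hdl=48.0, ldl=126.0)  # 26.0 mg/dL
cum_rc = cumulative_rc(rc_2012, rc_2015)                        # 69.0 mg/dL x years
print(rc_2012, rc_2015, cum_rc)
```

With only two measurements, Cum-RC is simply the average RC multiplied by the three-year interval, so it reflects overall exposure level rather than the shape of the trajectory between the two time points.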

Definition of CVD
New-onset CVD was the primary outcome of this research. In line with a previous CHARLS-related study, CVD incidence was ascertained from the participant's answer to the following questionnaire item in Wave 4: "Has your doctor ever told you that you have a heart-related illness (including angina, myocardial infarction, coronary artery disease, congestive heart failure, or other heart disease) or had a stroke?" Participants who answered "yes" were defined as having experienced a cardiovascular event [15,16].

Covariates
Baseline data were collected from in-person interviews of study participants by staff trained in questionnaire administration. The questionnaire covered demographic information (age, sex, residence, marriage), body mass index (BMI), health status (hypertension, dyslipidemia, and diabetes), lifestyle information (smoking, alcohol consumption), medication use (antihypertensive, lipid-lowering, hypoglycemic), and socioeconomic status (education). The residences were categorized as urban or rural [12]. Disease history included hypertension, dyslipidemia, and diabetes. In terms of education level, the study participants were categorized into four groups: no education, primary education, secondary education, and college education and above. Fasting venous blood collection was performed by professionally trained personnel, and TC, LDL, HDL, fasting plasma glucose (FPG), glycosylated hemoglobin A1c (HbA1c), and uric acid (UA) were measured.

Statistical analysis
Given the non-normal distribution of the study measurements, continuous variables were reported as the median and interquartile range (IQR), and categorical variables were described using frequency and percentage [n (%)]. Multiple imputation was used to fill in missing data to maximize statistical power and mitigate any bias that may result from missing data [17]. Information on missing variables can be found in Table S1. Two methods of grouping Cum-RC were tested, namely, K-means clustering and tertile grouping. To investigate the relationship between Cum-RC and the development of new-onset CVD, a logistic regression model was employed, adjusting for potential confounding factors. The K-means clustering algorithm is an iterative technique that uses distance as a measure of similarity to divide a given dataset into K distinct classes. In the clustering process, each class is characterized by a clustering center, which is determined by calculating the mean value of all the data points within that particular class [18,19].
The process of K-means clustering can be briefly summarized in the following steps. First, a collection of k data points is randomly selected from the dataset to serve as the initial clustering centers. Then, all the data points in the dataset are traversed, and each data point is assigned to the category corresponding to the cluster center nearest to it. Next, the clustering center of each category is recalculated as the mean value of all the data points within the class. The above steps are repeated until all the data points reach the minimum sum of the distances to the clustering centers of the classes to which they belong [20].
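The steps above can be sketched in plain Python as follows. This is a minimal illustration rather than the implementation used in the study: the input features (one pair of RC values per participant), the stopping rule (assignments no longer change), and the toy data are all assumptions made for the example.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iter=100, seed=0):
    """Plain K-means on a list of tuples; returns (labels, centers)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)           # step 1: random initial centers
    labels = None
    for _ in range(max_iter):
        # step 2: assign every point to its nearest center
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:              # assignments stable: stop
            break
        labels = new_labels
        # step 3: recompute each center as the mean of its members
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:                       # keep the old center if a class is empty
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centers


# Hypothetical (RC at first visit, RC at second visit) pairs, in mg/dL
rc_pairs = [(10.0, 12.0), (11.0, 14.0), (12.0, 13.0),
            (30.0, 40.0), (32.0, 41.0), (31.0, 43.0)]
labels, centers = kmeans(rc_pairs, k=2)
print(labels, centers)
```

Because the initial centers are chosen at random, the numeric label given to each class can differ between runs or seeds; only the partition itself is meaningful.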
Fig. 1 Flowchart for screening the research subjects
In this paper, the clustering effect was evaluated through the silhouette coefficient, which dynamically determines the range of values of K. The silhouette coefficient was first proposed by Peter J. Rousseeuw in 1986 based on the comparison of closeness and separation and can be used to choose an optimal number of clusters and provide an assessment of clustering effectiveness [21]. The silhouette coefficients, ranging from −1 to +1, provide a measure of the similarity between sample points and their respective clusters, with values closer to 1 suggesting a strong fit within their assigned clusters and a weaker fit with neighboring clusters [22]. In the evaluation process, each silhouette coefficient corresponds to a specific value of K. Therefore, a reasonable range of values of K can be determined based on the higher silhouette coefficients. The silhouette coefficient relationship graph (Fig. S1) shows that the clustering effect is optimal when K is 2.
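To make the silhouette coefficient concrete, the sketch below computes the mean silhouette for a given labeling using the standard definition s = (b − a) / max(a, b), where a is a point's mean distance to the other members of its own cluster and b is its smallest mean distance to any other cluster. The function name and the toy points are hypothetical; in practice one would compute this score for each candidate K and prefer the K with the highest value.

```python
from math import dist
from statistics import mean


def silhouette_score(points, labels):
    """Mean silhouette over all points; requires at least two clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)

    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:                     # convention: singleton clusters score 0
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(mean(dist(p, q) for q in other)
                for other_lab, other in clusters.items() if other_lab != lab)
        scores.append((b - a) / max(a, b))
    return mean(scores)


# Toy data: two tight, well-separated groups
pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_score(pts, [0, 0, 1, 1])    # natural grouping, close to +1
bad = silhouette_score(pts, [0, 1, 0, 1])     # groups mixed together, negative
print(round(good, 3), round(bad, 3))
```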
Figure 2A shows how the clustered population was divided, and Fig. 2B shows the trend of RC change in the two classes. Restricted cubic spline (RCS) modeling was used to explore the dose-response relationship between Cum-RC and CVD risk. Subgroup analyses were also performed; within them, interactions between the subgroup variables and Cum-RC in relation to the development of CVD were tested, as were trends across exposure levels in the different subgroups.
In addition, a weighted quantile sum (WQS) regression model was utilized to explore the overall association between combined exposure to the three lipids that make up RC (TC, LDL, and HDL) and CVD risk and to establish the relative contribution of each lipid to CVD risk [23]. In the WQS regression, the weight of each exposure ranges from 0 to 1, and the weights sum to 1 [24]. A higher weight indicates a greater contribution of that component to the overall exposure load. The exposure level of each lipid was converted into an ordinal variable by quartile, and these quartile scores were weighted and summed to obtain the weighted quantile sum of all exposure elements (WQS index). The WQS index represents the overall exposure load of the three lipids and was combined with the covariates above in a regression model reflecting the effect of combined exposure on outcome [25].
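The construction of the WQS index can be sketched as follows. In an actual WQS regression the weights are estimated from the data under the constraints described above; the sketch instead assumes the weights are already known, using the values reported in this paper for TC (0.652) and LDL (0.348) and taking the HDL weight to be 0 as implied by the sum-to-one constraint. The lipid values and function names are hypothetical.

```python
from statistics import quantiles


def quartile_scores(values):
    """Score each value 0-3 according to the quartile of the sample it falls in."""
    cuts = quantiles(values, n=4)             # three cut points
    return [sum(v > c for c in cuts) for v in values]


def wqs_index(exposures, weights):
    """Weighted sum of quartile scores; weights must be non-negative and sum to 1."""
    if any(w < 0 for w in weights.values()) or abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must lie in [0, 1] and sum to 1")
    scored = {name: quartile_scores(vals) for name, vals in exposures.items()}
    n = len(next(iter(exposures.values())))
    return [sum(weights[name] * scored[name][i] for name in exposures)
            for i in range(n)]


# Hypothetical cumulative lipid exposures for eight participants (mg/dL x years)
exposures = {
    "Cum-TC":  [450, 480, 510, 540, 570, 600, 630, 660],
    "Cum-LDL": [450, 240, 270, 300, 330, 360, 390, 420],
    "Cum-HDL": [180, 175, 170, 165, 160, 155, 150, 145],
}
# Assumed fixed weights: TC and LDL as reported; HDL inferred from the sum-to-one rule
weights = {"Cum-TC": 0.652, "Cum-LDL": 0.348, "Cum-HDL": 0.0}

index = wqs_index(exposures, weights)
print([round(x, 3) for x in index])
```

The first participant has the lowest TC but the highest LDL, so the index is 0.348 × 3 = 1.044, which shows how a component with a smaller weight still moves the index, only less than TC would.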
The statistical analyses were completed using Stata 16.0 and R 4.1.1 software.

Baseline characteristics
There were 5,372 people in this study, with a median age of 58 years. A total of 45.94% of the participants were male, 53.02% had completed elementary school or above, and 47.82% were rural dwellers. The medians (IQRs) for RC in 2012 and 2015 were 19.72 (11.60, 32.09) mg/dL × years and 25.87 (19.69, 35.91) mg/dL × years, respectively. The median (IQR) for Cum-RC was 70.13 (51.55, 100.30) mg/dL × years. Compared to those in the Class 1 group, the Class 2 participants were younger; had a greater proportion of males, smokers, and alcohol drinkers; were more educated; had a lower BMI, systolic blood pressure (SBP), and diastolic blood pressure (DBP); had a lower incidence of hypertension and dyslipidemia; and had lower levels of FPG, HbA1c, UA, TC, HDL, LDL, and Cum-RC (P < 0.05). In addition, tertiles of Cum-RC levels were used to divide participants into 3 groups (Table 1). Compared with those in the first tertile (T1), participants in the third tertile (T3) were younger; had a greater proportion of females and nonsmokers; had a greater BMI; and had a greater incidence of hypertension, dyslipidemia, and diabetes (P < 0.001). In addition, increased Cum-RC was positively correlated with SBP, DBP, FPG, HbA1c, UA, TC, LDL, RC in 2012, and RC in 2015 and negatively correlated with HDL (P < 0.001).

Associations between Cum-RC scores and new-onset CVD incidence
Table 2 shows that after 3 years of follow-up, 669 participants (12.45%) developed CVD, 407 (7.58%) had heart disease, and 300 (5.58%) had stroke. The Class 2 group had a lower risk of CVD and of heart disease than the Class 1 group, whereas the risk of stroke did not differ significantly between the two groups. In addition, comparing T3 with T1 of Cum-RC yielded odds ratios for CVD (OR = 1.27, 95% CI = 1.02-1.58), heart disease (OR = 1.17, 95% CI = 0.90-1.54), and stroke (OR = 1.51, 95% CI = 1.09-2.07). Cum-RC was significantly associated with an elevated risk of CVD and stroke (P for trend = 0.033 and 0.012, respectively). However, the increase in the risk of developing heart disease was nonsignificant (P for trend = 0.249). Notably, in the RCS regression model, a positive linear correlation between Cum-RC and CVD risk was observed (P for nonlinearity = 0.679) (Fig. 3).
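For readers less familiar with how an odds ratio and its 95% confidence interval relate to statistical significance, the sketch below computes a crude (unadjusted) OR from a 2 × 2 table using the standard log-odds (Woolf) interval. It is only an illustration: the ORs reported above come from covariate-adjusted logistic regression, and the counts used here are hypothetical, not study data.

```python
from math import exp, log, sqrt


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Crude OR and Wald 95% CI from a 2x2 table.

    a: exposed cases        b: exposed non-cases
    c: unexposed cases      d: unexposed non-cases
    """
    or_ = (a * d) / (b * c)
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)   # standard error of log(OR)
    return or_, exp(log(or_) - z * se), exp(log(or_) + z * se)


# Hypothetical counts: 20/100 events in the exposed group, 10/100 in the reference group
or_t3, low, high = odds_ratio_ci(20, 80, 10, 90)
print(round(or_t3, 2), round(low, 2), round(high, 2))
```

In this made-up example the OR is 2.25, but the interval includes 1, so the association would not be significant at the 0.05 level; the same reading applies to the heart disease estimate above, whose interval (0.90-1.54) also spans 1.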

Subgroup analysis
In subgroup analyses, an interaction between Cum-RC and age as well as hypertension was found (Table 3). Among participants who were < 60 years of age, female, lived in rural areas, married, had a BMI < 24 kg/m², did not smoke, did not drink alcohol, were not hypertensive, had normal lipids, or did not have diabetes, Class 2 was associated with a lower risk of CVD (P < 0.05) (Table 3). In addition, a subgroup analysis of Cum-RC tertiles was performed and did not reveal an interaction between Cum-RC and subgroup variables. However, among participants who were male, lived in rural areas, smoked, drank alcohol, had a BMI < 24 kg/m², and were free of dyslipidemia, the risk of CVD increased with increasing Cum-RC (P < 0.05) (Table S2).

Joint lipid exposure analysis based on WQS analysis
The TC, LDL, and HDL components of Cum-RC were analyzed in depth using the WQS regression model. The model assessed the association of cumulative TC (Cum-TC), cumulative LDL (Cum-LDL), and cumulative HDL (Cum-HDL) exposures with CVD risk. The WQS regression results showed that Cum-TC had the highest relative contribution weight (0.652) among the three variables, followed by Cum-LDL (Fig. 4). Although the association of the WQS index of the TC, LDL, and HDL mixture with CVD incidence was nonsignificant (OR = 1.11, 95% CI = 1.00-1.22, P > 0.05), the confidence interval did not cross 1, and the overall trend was positive (Fig. 5).
The WQS index of mixed lipids was not significantly associated with heart disease or stroke risk (OR = 1.14, 95% CI = 0.99-1.31; OR = 1.05, 95% CI = 0.92-1.21, respectively), with both confidence intervals spanning 1. In addition, Cum-TC was significantly associated with the risk of CVD and heart disease (P < 0.05) (Fig. 5). Cum-LDL was associated with heart disease alone (P < 0.05). In addition, Cum-HDL was inversely associated with the risk of CVD and stroke (P < 0.05).

Sensitivity analyses
A regression analysis was performed after excluding participants with extreme BMI (< 18.5 or > 30 kg/m²) and dyslipidemia. The results showed that all outcomes remained virtually unchanged after excluding these participants. A statistically significant (P < 0.05) correlation was shown between high Cum-RC and increased CVD risk in both the clustered and tertile groups. Similarly, Class 2 participants had a significantly lower risk of heart disease than Class 1 participants (P < 0.05). The T3 group exhibited a significantly higher risk of stroke (P < 0.05) compared to the T1 group (Table S3). However, no significant alteration in the risk of heart disease was observed (P > 0.05) (Table S4).

Discussion
The present investigation examined the relationship between Cum-RC and CVD risk by utilizing two distinct statistical analysis models. Cum-RC exhibited an independent association with CVD risk among individuals aged 45 years and older in the CHARLS database. The WQS model demonstrated a mixed effect of combined TC, LDL, and HDL exposures on outcomes, and the WQS index tended to correlate positively with the risk of CVD, with TC contributing the most. Ideal lipid levels are essential for reducing cardiovascular-related risks. Currently, LDL is the primary target for assessing and treating atherosclerotic cardiovascular disease (ASCVD) risk, whereas non-HDL-C or apolipoprotein B (Apo B) are considered secondary targets [26,27]. Although LDL is a common biomarker used to assess the reduction in ASCVD risk, RC has attracted increasing attention in recent years due to its potential to trigger endothelial damage and atherosclerosis with less modification than LDL [28].
Fig. 3 Linear associations between Cum-RC and CVD incidence. We adjusted for potential confounders, including age, sex, education level, marital status, residence, BMI, smoking status, drinking status, SBP, DBP, hypertension, dyslipidemia, diabetes, lipid-lowering drugs, antihypertensive drugs, hypoglycemic drugs, FPG, HbA1c, and UA
Fig. 4 Estimated weights of the three lipids for CVD. We adjusted for age, sex, education level, marital status, residence, BMI, smoking status, drinking status, SBP, DBP, hypertension, dyslipidemia, diabetes, lipid-lowering drugs, antihypertensive drugs, hypoglycemic drugs, FPG, HbA1c, and UA
In multivariate models, potential confounders other than grouping variables were adjusted for, including age, sex, education level, marital status, residence, BMI, smoking status, drinking status, SBP, DBP, hypertension, dyslipidemia, diabetes, lipid-lowering drugs, antihypertensive drugs, hypoglycemic drugs, FPG, HbA1c, and UA
Many recent studies have shown that RC has a substantial impact on CVD risk and prognostic outcomes [13,29-31]. The results of this analysis suggested that lower Cum-RC was linked to a lower risk of CVD. When considering LDL, however, there was no such correlation. These results are consistent with those of prior research [32,33]. A cohort study of Spanish older adults revealed that cardiovascular outcomes were associated with TG and RC levels but not with LDL [32]. These results were further confirmed by a substantial prospective cohort study performed in Canada [33]. Furthermore, a Korean cohort study showed that RC had a marginally greater impact on CVD incidence than LDL and that high RC in combination with LDL posed a greater risk of CVD than either indicator alone [34]. A recent Chinese longitudinal cohort study also revealed that RC was more independently linked with atherosclerosis progression than was LDL [35]. All these studies provide evidence that RC has a bearing on CVD incidence and suggest that RC, alone or combined with LDL, may be superior to LDL alone as an early assessment tool for CVD incidence. Therefore, RC is hypothesized to potentially become one of the primary targets for lipid-lowering treatments. Recent ASCVD prevention guidelines also recommend using non-HDL rather than LDL alone [26,36]. Multiple meta-analyses [37,38] have underscored the significance of incorporating RC as a potential biomarker in the evaluation and prediction of CVD risk and adverse cardiovascular events. Prior studies have indicated a positive association between elevated RC and an elevated risk of ASCVD in diabetic patients [39].
However, this study revealed no evidence of a link between Cum-RC and CVD risk in diabetes patients following stratified analysis. The inconsistency between the results of these two studies may be related to factors such as differences in definitions of exposures and outcomes and the national and ethnic heterogeneity of the study populations. In addition, adjustment for glucose-lowering medications may have been an influential factor. Importantly, in the subgroup analyses of Cum-RC data based on tertile groupings, there was no evidence of a connection between Cum-RC and CVD risk in either the hypertensive or the nonhypertensive population. A study from the CHARLS cohort showed a significant effect of increased RC on CVD incidence in the hypertensive population, whereas no such association was observed among the group without hypertension [40]. That study used only a single measurement of RC, which may be the main reason for its inconsistency with the results of the present study. Finally, in the subgroup analysis based on clustered grouped data, in the nonhypertensive population, CVD risk was significantly lower in the low-level slow-growth RC group (Class 2) than in the high-level fast-growth RC group (Class 1) (P < 0.001). The difference in results was strongly associated with the grouping method. A likely explanation is that clustering models can combine multiple variables and group datasets at multiple time points and dimensions to better understand each subgroup's characteristics and differences. In contrast, considering only one variable, RC, and dividing the intervals according to tertiles did not characterize each subgroup.
Fig. 5 WQS modeling to analyze the association between combined exposure to three lipids and CVD risk. We adjusted for age, sex, education level, marital status, residence, BMI, smoking status, drinking status, SBP, DBP, hypertension, dyslipidemia, diabetes, lipid-lowering drugs, antihypertensive drugs, hypoglycemic drugs, FPG, HbA1c, and UA
RC represents the cholesterol composition within TRLs [41]. There are several explanations for the mechanism by which RC contributes to ASCVD. First, RC can reach the arterial intima at a slower rate than LDL [42]. After some TG are broken down, cholesterol builds up in the intima, leading to plaque formation and the development of ASCVD [43]. Second, RC is the major oxidized lipoprotein in plasma and does not require oxidation in vitro but can be as pro-inflammatory and pro-ASCVD as LDL [44]. In addition, RC can cause low-grade inflammation [45]. The underlying mechanism may be that lipid lipases on the surface of RC residues lead to the release of free fatty acids, monoacylglycerols, and other molecules, all of which may contribute to localized damage and inflammation [46]. High levels of RC may be associated with arterial wall inflammation following endothelial injury, and persistent inflammatory stimuli may lead to hyperproliferation of vascular smooth muscle cells and neointimal hyperplasia [45,47].

Strengths and limitations
This study has several advantages. First, compared with previous RC-CVD association studies in which only single measurements were performed, this study used cumulative exposures for the analysis, thus increasing the reliability of the findings. Second, the WQS joint exposure model was created to assess complex human exposure patterns and actual exposure levels. The WQS model can evaluate the combined impact of several lipid components on CVD incidence risk and assign a relative importance weight to each lipid. It is more sensitive than single lipid models for identifying risk factors.
Several limitations should be acknowledged in this study. First, the study used calculated RC levels rather than direct measurements due to database limitations. Although calculated RC concentrations may introduce a degree of bias, it has been shown that calculated RC concentrations correlate well with direct measurements [48]. Moreover, the European Atherosclerosis Society Consensus Statement advocates for the combined utilization of directly measured and calculated RC data in clinical practice [46]. Currently, indirect computation methods are commonly employed in most studies because of their economic convenience and time efficiency [13,32,49]. Second, because individuals without complete TC, LDL, or HDL data were excluded, selection bias may have been introduced, whereby missing data are associated with specific characteristics that are also associated with study outcomes. This could lead to underestimation or overestimation of true associations. In addition, there may be information bias due to incomplete or inaccurate data, which can affect the precision of the estimates and potentially distort the observed relationships between variables. Third, because only two blood tests were performed, more detailed information on the development of RC levels could not be obtained. Fourth, caution should be exercised when extrapolating the results of this study, as it exclusively involved participants aged 45 years and older from the Chinese population.

Conclusion
The study revealed a noteworthy association between elevated RC levels and heightened CVD risk among middle-aged and elderly individuals in the Chinese population. Specifically, the Class 1 group, characterized by a high level of rapidly increasing RC, exhibited a considerably heightened susceptibility to developing CVD. Based on the observed outcomes, this study posits Cum-RC as a potential predictor of CVD risk. Aggressive RC interventions and more frequent cardiovascular monitoring appear to be necessary for high-risk patients.
In Fig. 2B, for the Class 1 (n = 2,219) population, RC increased from 22.04 (12.76, 34.02) mg/dL × years in 2012 to 29.73 (22.39, 39.96) mg/dL × years in 2015 (P < 0.001), with a Cum-RC of 78.82 (57.94, 108.90) mg/dL × years, showing a rapid increasing trend. For the Class 2 (n = 3,153) group, RC increased from 18.56 (10.82, 30.54) mg/dL × years in 2012 to 23.94 (18.15, 32.05) mg/dL × years in 2015 (P < 0.001), with a Cum-RC of 65.06 (47.51, 92.71) mg/dL × years, showing a slow increasing trend. Moreover, Fig. 2C and D show the RC distributions of the two classes in 2012 and 2015, respectively, and reveal the differences between the two groups. Notably, these data all exhibited a non-normal distribution.

Fig. 2 Analysis of changes in RC via the K-means clustering model. (A) Scatterplot visualizing the distribution of two categories of data based on K-means clustering; (B) trend of RC change in two categories of population after clustering; (C) density plot of RC distribution in 2012 for two categories of population; (D) density plot of RC distribution in 2015 for two categories of population

Table 1 Baseline characteristics

Table 2 The association between Cum-RC and CVD incidence status

Table 3 Subgroup analysis based on clustering results